260 research outputs found

    Mining social semantics on the social web


    Posted, Visited, Exported: Altmetrics in the Social Tagging System BibSonomy

    In social tagging systems such as Mendeley, CiteULike, and BibSonomy, users can post, tag, visit, or export scholarly publications. In this paper, we compare citations with metrics derived from users’ activities (altmetrics) in the popular social bookmarking system BibSonomy. Our analysis, using a corpus of more than 250,000 publications published before 2010, reveals that, overall, citations and altmetrics in BibSonomy are mildly correlated. Furthermore, grouping publications by user-generated tags results in topic-homogeneous subsets that exhibit higher correlations with citations than the full corpus. We find that posts, exports, and visits of publications are correlated with citations and even bear predictive power over future impact. Machine learning classifiers predict whether the number of citations that a publication receives in a year exceeds the median number of citations in that year, based on the usage counts of the preceding year. In that setup, a Random Forest predictor outperforms the baseline by seven percentage points on average.
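The two quantities this abstract relies on, a correlation between usage counts and citations and a binary "above the yearly median" target, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used, so Spearman rank correlation is an assumption here. The sample numbers and the function names `spearman` and `above_median_labels` are hypothetical and not taken from the paper.

```python
from statistics import median


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def above_median_labels(citations):
    """Binary target: does a publication exceed the median citation count?"""
    threshold = median(citations)
    return [count > threshold for count in citations]


# Hypothetical usage counts (posts) and citation counts for six publications.
posts = [1, 4, 2, 9, 0, 7]
citations = [3, 10, 2, 40, 0, 12]

rho = spearman(posts, citations)
labels = above_median_labels(citations)
```

The labels produced by `above_median_labels` are the kind of target a classifier such as a Random Forest would be trained to predict from the preceding year's usage counts.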

    Folksonomies and clustering in the collaborative system CiteULike

    We analyze CiteULike, an online collaborative tagging system where users bookmark and annotate scientific papers. Such a system can be naturally represented as a tripartite graph whose nodes represent papers, users and tags connected by individual tag assignments. The semantics of tags is studied here in order to uncover the hidden relationships between tags. We find that the clustering coefficient reflects the semantic patterns among tags, providing useful ideas for designing more efficient methods of data classification and spam detection. Comment: 9 pages, 5 figures, iop style; corrected typo
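As a rough illustration of the idea, the sketch below projects a set of (user, paper, tag) assignments onto a tag co-occurrence graph and computes the standard local clustering coefficient of a tag. This is an assumption about one plausible construction, not the authors' exact definition, which may operate directly on the tripartite structure. The sample assignments and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tag_cooccurrence_graph(assignments):
    """Link two tags whenever they annotate the same paper (by any user)."""
    tags_by_paper = defaultdict(set)
    for _user, paper, tag in assignments:
        tags_by_paper[paper].add(tag)

    neighbours = defaultdict(set)
    for tags in tags_by_paper.values():
        for a, b in combinations(sorted(tags), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def clustering_coefficient(neighbours, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = neighbours.get(node, set())
    k = len(nbrs)
    if k < 2:
        return 0.0  # Undefined for fewer than two neighbours; 0.0 by convention.
    links = sum(
        1 for a, b in combinations(sorted(nbrs), 2) if b in neighbours[a]
    )
    return 2 * links / (k * (k - 1))


# Hypothetical tag assignments: (user, paper, tag).
assignments = [
    ("u1", "p1", "graph"),
    ("u1", "p1", "network"),
    ("u2", "p1", "clustering"),
    ("u2", "p2", "graph"),
    ("u3", "p2", "spam"),
    ("u3", "p3", "network"),
    ("u3", "p3", "spam"),
]

graph = tag_cooccurrence_graph(assignments)
coefficient_for_graph_tag = clustering_coefficient(graph, "graph")
```

A tag whose neighbours rarely co-occur with each other gets a low coefficient, which is the kind of signal the abstract suggests could help with classification and spam detection.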

    Enrichment and ranking of the YouTube tag space and integration with the Linked Data cloud

    The spread of personal digital cameras with video functionality and video-enabled camera phones has increased the amount of user-generated video on the Web. People are spending more and more time viewing online videos as a major source of entertainment and “infotainment”. Social websites allow users to assign shared free-form tags to user-generated multimedia resources, thus generating annotations for objects with a minimum amount of effort. Tagging allows communities to organise their multimedia items into browseable sets, but these tags may be poorly chosen and related tags may be omitted. Current techniques to retrieve, integrate and present this media to users are deficient and need improvement. In this paper, we describe a framework for semantic enrichment, ranking and integration of web video tags using Semantic Web technologies. Semantic enrichment of folksonomies can bridge the gap between the uncontrolled and flat structures typically found in user-generated content and the structures provided by the Semantic Web. The enhancement of tag spaces with semantics has been accomplished through two major tasks: a tag space expansion and ranking step, and concept matching and integration with the Linked Data cloud. We have explored social, temporal and spatial contexts to enrich and extend the existing tag space. The resulting semantic tag space is modelled via a local graph based on co-occurrence distances for ranking. A ranked tag list is mapped and integrated with the Linked Data cloud through the DBpedia resource repository. Multi-dimensional context filtering for tag expansion makes tag ranking much easier and provides less ambiguous tag-to-concept matching.

    Evaluating the semantic web: a task-based approach

    The increased availability of online knowledge has led to the design of several algorithms that solve a variety of tasks by harvesting the Semantic Web, i.e. by dynamically selecting and exploring a multitude of online ontologies. Our hypothesis is that the performance of such novel algorithms implicitly provides an insight into the quality of the ontologies used and thus opens the way to a task-based evaluation of the Semantic Web. We have investigated this hypothesis by studying the lessons learnt about online ontologies when used to solve three tasks: ontology matching, folksonomy enrichment, and word sense disambiguation. Our analysis leads to a suite of conclusions about the status of the Semantic Web, which highlight a number of strengths and weaknesses of the semantic information available online and complement the findings of other analyses of the Semantic Web landscape.

    A study on text-score disagreement in online reviews

    In this paper, we focus on online reviews and employ artificial intelligence tools, taken from the cognitive computing field, to help understand the relationship between the textual part of a review and the assigned numerical score. We start from the intuitions that 1) a set of textual reviews expressing different sentiments may feature the same score (and vice versa); and 2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts. To test these intuitions, we adopt sentiment analysis techniques and concentrate on hotel reviews, to find polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews taken from the Booking website. Then, we analyze a large dataset of around 160k hotel reviews collected from Tripadvisor, with the aim of detecting polarity mismatches, i.e. cases where the textual content of the review is not in line with the associated score. Using well-established artificial intelligence techniques and analyzing in depth the reviews featuring a mismatch between the text polarity and the score, we find that, on a scale of five stars, reviews with middle scores include a mixture of positive and negative aspects. The approach proposed here, besides acting as a polarity detector, provides an effective selection of reviews from an initially very large dataset, which may allow both consumers and providers to focus directly on the review subset featuring a text/score disagreement; this subset conveniently conveys to the user a summary of positive and negative features of the review target. Comment: This is the accepted version of the paper. The final version will be published in the Journal of Cognitive Computation, available at Springer via http://dx.doi.org/10.1007/s12559-017-9496-
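The mismatch-detection step can be sketched as a comparison between a text polarity and a polarity derived from the star score. Everything concrete below is an assumption made for illustration: the star thresholds, the three polarity labels, and the tiny word-list classifier `toy_classify`, which merely stands in for the trained text classifier the abstract describes.

```python
# Hypothetical word lists; a real system would use a trained classifier.
POSITIVE_WORDS = {"great", "clean", "friendly"}
NEGATIVE_WORDS = {"dirty", "noisy", "rude"}


def toy_classify(text):
    """Stand-in polarity classifier based on counting listed words."""
    words = [word.strip(".,!?") for word in text.lower().split()]
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "mixed"


def score_polarity(stars):
    """Map a 1-5 star score to a polarity (thresholds are an assumption)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "mixed"


def find_mismatches(reviews, classify):
    """Return reviews whose text polarity disagrees with their star score."""
    return [
        review
        for review in reviews
        if classify(review["text"]) != score_polarity(review["stars"])
    ]


# Hypothetical reviews, not drawn from the Booking or Tripadvisor data.
reviews = [
    {"text": "Great location, friendly staff", "stars": 5},
    {"text": "Dirty room and rude staff", "stars": 4},
    {"text": "Clean but noisy", "stars": 3},
]

mismatches = find_mismatches(reviews, toy_classify)
```

Passing the classifier in as an argument keeps the comparison logic independent of how the text polarity is obtained, so a trained model could replace `toy_classify` without other changes.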

    Exploiting the Social Capital of Folksonomies for Web Page Classification
